Río De La Plata
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Jeong, Soyeong, Jung, Taehee, Hwang, Sung Ju, Kim, Joo-Kyung, Kang, Dongyeop
Recent Long-Context Language Models (LCLMs) can process hundreds of thousands of tokens in a single prompt, enabling new opportunities for knowledge-intensive multi-hop reasoning by integrating large sets of retrieved documents or, in some cases, directly all necessary information. However, simply feeding more documents into the context window fails to capture how evidence should be connected. We address this gap with thought templates, which recast reasoning as reusable thought caches, derived from prior problem-solving traces, structuring how evidence is combined and guiding multi-hop inference with factual documents. To keep these templates effective, we propose an update strategy that iteratively refines templates derived from training data through natural-language feedback. Across diverse benchmarks and LCLM families, our approach delivers consistent gains over strong baselines in both retrieval-based and retrieval-free settings. Furthermore, we show that optimized templates can be distilled into smaller open-source models, demonstrating the approach's broad applicability and transparent reasoning reuse. We refer to our framework as Thought Template Augmented LCLMs (ToTAL).
- Europe > Austria > Vienna (0.14)
- North America > United States > Ohio (0.05)
- North America > United States > Indiana > Dearborn County (0.04)
- Research Report (0.64)
- Overview (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries
Capdevila, Martin, Turek, Esteban Villa, Fernandez, Ellen Karina Chumbe, Galvez, Luis Felipe Polo, Marroquin, Andrea, Quesada, Rebeca Vargas, Crew, Johanna, Galarraga, Nicole Vallejo, Rodriguez, Christopher, Gutierrez, Diego, Datla, Radhi
Large language models are, by definition, based on language. In an effort to underscore the critical need for regional localized models, this paper examines primary differences between variants of written Spanish across Latin America and Spain, with an in-depth sociocultural and linguistic contextualization therein. We argue that these differences effectively constitute significant gaps in the quotidian use of Spanish among dialectal groups by creating sociolinguistic dissonances, to the extent that locale-sensitive AI models would play a pivotal role in bridging these divides. In doing so, this approach informs better and more efficient localization strategies that also serve to more adequately meet inclusivity goals, while securing sustainable active daily user growth in a major low-risk investment geographic area. Therefore, implementing at least the proposed five sub variants of Spanish addresses two lines of action: to foment user trust and reliance on AI language models while also demonstrating a level of cultural, historical, and sociolinguistic awareness that reflects positively on any internationalization strategy.
- North America > Central America (0.38)
- South America > Peru (0.06)
- South America > Ecuador (0.06)
It's the same but not the same: Do LLMs distinguish Spanish varieties?
Mayor-Rocher, Marina, Pozo, Cristina, Melero, Nina, Martínez, Gonzalo, Grandury, María, Reviriego, Pedro
In recent years, large language models (LLMs) have demonstrated a high capacity for understanding and generating text in Spanish. However, with five hundred million native speakers, Spanish is not a homogeneous language but rather one rich in diatopic variations spanning both sides of the Atlantic. For this reason, in this study, we evaluate the ability of nine language models to identify and distinguish the morphosyntactic and lexical peculiarities of seven varieties of Spanish (Andean, Antillean, Continental Caribbean, Chilean, Peninsular, Mexican and Central American, and Rioplatense) through a multiple-choice test. The results indicate that the Peninsular Spanish variety is the best identified by all models and that, among them, GPT-4o is the only model capable of recognizing the variability of the Spanish language.
- Europe > Spain > Galicia > Madrid (0.05)
- North America > Mexico (0.05)
- North America > United States > New York (0.05)
A Library for Automatic Natural Language Generation of Spanish Texts
García-Méndez, Silvia, Fernández-Gavilanes, Milagros, Costa-Montenegro, Enrique, Juncal-Martínez, Jonathan, González-Castaño, F. Javier
In this article we present a novel system for natural language generation (NLG) of Spanish sentences from a minimum set of meaningful words (such as nouns, verbs and adjectives) which, unlike other state-of-the-art solutions, performs the NLG task in a fully automatic way, exploiting both knowledge-based and statistical approaches. Relying on its linguistic knowledge of vocabulary and grammar, the system is able to generate complete, coherent and correctly spelled sentences from the main word sets presented by the user. The system, which was designed to be integrable, portable and efficient, can be easily adapted to other languages by design and can feasibly be integrated in a wide range of digital devices. During its development we also created a supplementary lexicon for Spanish, aLexiS, with wide coverage and high precision, as well as syntactic trees from a freely available definite-clause grammar. The resulting NLG library has been evaluated both automatically and manually (annotation). The system can potentially be used in different application domains such as augmentative communication and automatic generation of administrative reports or news.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Montenegro (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Transforming Observations of Ocean Temperature with a Deep Convolutional Residual Regressive Neural Network
Larson, Albert, Akanda, Ali Shafqat
Sea surface temperature (SST) is an essential climate variable that can be measured via ground truth, remote sensing, or hybrid model methodologies. Here, we celebrate SST surveillance progress via the application of a few relevant technological advances from the late 20th and early 21st century. We further develop our existing water cycle observation framework, Flux to Flow (F2F), to fuse AMSR-E and MODIS into a higher resolution product with the goal of capturing gradients and filling cloud gaps that are otherwise unavailable. Our neural network architecture is constrained to a deep convolutional residual regressive neural network. We utilize three snapshots of twelve monthly SST measurements in 2010 as measured by the passive microwave radiometer AMSR-E, the visible and infrared monitoring MODIS instrument, and the in situ Argo dataset ISAS. The performance of the platform and success of this approach is evaluated using the root mean squared error (RMSE) metric. We determine that the 1:1 configuration of input and output data and a large observation region is too challenging for the single compute node and dcrrnn structure as is. When constrained to a single 100 x 100 pixel region and a small training dataset, the algorithm improves from the baseline experiment covering a much larger geography. For next discrete steps, we envision the consideration of a large input range with a very small output range. Furthermore, we see the need to integrate land and sea variables before performing computer vision tasks like those within. Finally, we see parallelization as necessary to overcome the compute obstacles we encountered.
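The RMSE metric named in this abstract can be illustrated with a minimal sketch. The function name `rmse` and the sample temperature values are hypothetical and chosen only for illustration; they are not taken from the paper, whose evaluation code is not shown here.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("inputs must not be empty")
    # Square each pointwise difference, average the squares, then take the root.
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical SST values in degrees Celsius, for illustration only.
predicted_sst = [18.2, 19.0, 21.5]
observed_sst = [18.0, 19.4, 21.1]
print(rmse(predicted_sst, observed_sst))
```

A lower RMSE means the predicted field is closer to the reference observations, and the result is expressed in the same units as the inputs.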
- Southern Ocean (0.04)
- Pacific Ocean (0.04)
- Indian Ocean > Bay of Bengal (0.04)
DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models
A few benchmarking datasets have been released to evaluate the factual knowledge of pretrained language models. These benchmarks (e.g., LAMA, and ParaRel) are mainly developed in English and later are translated to form new multilingual versions (e.g., mLAMA, and mParaRel). Results on these multilingual benchmarks suggest that using English prompts to recall the facts from multilingual models usually yields significantly better and more consistent performance than using non-English prompts. Our analysis shows that mLAMA is biased toward facts from Western countries, which might affect the fairness of probing models. We propose a new framework for curating factual triples from Wikidata that are culturally diverse. A new benchmark DLAMA-v1 is built of factual triples from three pairs of contrasting cultures having a total of 78,259 triples from 20 relation predicates. The three pairs comprise facts representing the (Arab and Western), (Asian and Western), and (South American and Western) countries respectively. Having a more balanced benchmark (DLAMA-v1) supports that mBERT performs better on Western facts than non-Western ones, while monolingual Arabic, English, and Korean models tend to perform better on their culturally proximate facts. Moreover, both monolingual and multilingual models tend to make a prediction that is culturally or geographically relevant to the correct label, even if the prediction is wrong.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > France (0.06)
- South America > Brazil (0.05)
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Ghosal, Deepanway, Majumder, Navonil, Mehrish, Ambuj, Poria, Soujanya
The immense scale of recent large language models (LLMs) enables many interesting properties, such as instruction- and chain-of-thought-based fine-tuning, that have significantly improved zero- and few-shot performance in many natural language processing (NLP) tasks. Inspired by such successes, we adopt such an instruction-tuned LLM, Flan-T5, as the text encoder for text-to-audio (TTA) generation, a task where the goal is to generate audio from its textual description. Prior works on TTA either pre-trained a joint text-audio encoder or used a non-instruction-tuned model such as T5. Consequently, our latent diffusion model (LDM)-based approach, TANGO, outperforms the state-of-the-art AudioLDM on most metrics and stays comparable on the rest on the AudioCaps test set, despite training the LDM on a 63 times smaller dataset and keeping the text encoder frozen. This improvement might also be attributed to the adoption of audio pressure level-based sound mixing for training set augmentation, whereas prior methods take a random mix.
- Asia > Singapore (0.04)
- South America > Uruguay (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Media > Music (0.46)
- Leisure & Entertainment (0.46)
How Deep Learning is helping to save human lives at a container terminal
The Port of Montevideo is located in the capital city, Montevideo, on the banks of the Río de la Plata. Due to its strategic location between the Atlantic Ocean and the Uruguay River, it is considered one of the main routes of cargo mobilization for Uruguay and MERCOSUR. Over the past decades, it has established itself as a multipurpose port handling containers, bulk, fishing boats, cruises, passenger transport, cars, and general cargo. MERCOSUR, officially the Southern Common Market, is a commercial and political bloc established in 1991 by several South American countries. Moreover, only two companies handle all cargo operations in this port: Katoen Natie, a company of Belgian origin, and Montecon, a company backed by Chilean and Canadian capital.
- South America > Uruguay > Montevideo > Montevideo (0.46)
- North America > Canada (0.25)
- Atlantic Ocean > South Atlantic Ocean > Río De La Plata (0.25)
- Law (0.71)
- Transportation (0.55)
Narrative Cartography with Knowledge Graphs
Mai, Gengchen, Huang, Weiming, Cai, Ling, Zhu, Rui, Lao, Ni
Narrative cartography is a discipline which studies the interwoven nature of stories and maps. However, conventional geovisualization techniques of narratives often encounter several prominent challenges, including the data acquisition & integration challenge and the semantic challenge. To tackle these challenges, in this paper, we propose the idea of narrative cartography with knowledge graphs (KGs). Firstly, to tackle the data acquisition & integration challenge, we develop a set of KG-based GeoEnrichment toolboxes to allow users to search and retrieve relevant data from integrated cross-domain knowledge graphs for narrative mapping from within a GISystem. With the help of this tool, the retrieved data from KGs are directly materialized in a GIS format which is ready for spatial analysis and mapping. Two use cases - Magellan's expedition and World War II - are presented to show the effectiveness of this approach. In the meantime, several limitations are identified from this approach, such as data incompleteness, semantic incompatibility, and the semantic challenge in geovisualization. For the latter two limitations, we propose a modular ontology for narrative cartography, which formalizes both the map content (Map Content Module) and the geovisualization process (Cartography Module). We demonstrate that, by representing both the map content and the geovisualization process in KGs (an ontology), we can realize both data reusability and map reproducibility for narrative cartography.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > New York (0.04)
- Europe > Spain > Canary Islands (0.04)
- Information Technology > Services (0.92)
- Government (0.91)
- Information Technology > Communications > Web > Semantic Web (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)
OracleVoice: AI And Other New Technologies Make 'Smart Cities' Even Smarter
As people increasingly migrate to cities in search of jobs, services, and other urban benefits, local governments are turning to emerging technologies to respond to the pressures of their growing populations. Tech-savvy "smart cities" are reacting to heightened demands on scarce resources by developing new capabilities such as artificial intelligence (AI) and sensor-driven analytics to resolve myriad challenges, from crime to congestion. The new insights that result are helping city managers look at old problems in a new light, while cloud computing is making these efforts affordable and realistic. Analytics, for example, can help cities use existing resources more efficiently, according to Joel Cherkis, a group vice president at Oracle. "Traditionally, when crime rates go up, cities hire more police," he says.
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.07)
- Europe > United Kingdom > England (0.05)
- Atlantic Ocean > South Atlantic Ocean > Río De La Plata (0.05)